Abstract

Solomonoff's central result on induction is that the prediction of a universal semimeasure $M$ converges rapidly and with probability 1 to the true sequence generating predictor $\mu$, if the latter is computable. Hence, $M$ is eligible as a universal sequence predictor in case of unknown $\mu$. Despite some nearby results and proofs in the literature, the stronger result of convergence for all (Martin-Löf) random sequences remained open. Such a convergence result would be particularly interesting and natural, since randomness can be defined in terms of $M$ itself. We show that there are universal semimeasures $M$ which do not converge to $\mu$ on all $\mu$-random sequences, i.e. we give a partial negative answer to the open problem. We also provide a positive answer for some non-universal semimeasures. We define the incomputable measure $D$ as a mixture over all computable measures and the enumerable semimeasure $W$ as a mixture over all enumerable nearly-measures. We show that $W$ converges to $D$ and $D$ to $\mu$ on all random sequences. The Hellinger distance measuring closeness of two distributions plays a central role.

1 Introduction

"All difficult conjectures should be proved by reductio ad absurdum argu- 
ments. For if the proof is long and complicated enough you are bound 
to make a mistake somewhere and hence a contradiction will inevitably 
appear, and so the truth of the original conjecture is established QED. " 

— Barrow's second law' (2004) 

A sequence prediction task is to predict the next symbol $x_n$ from an observed sequence $x = x_1...x_{n-1}$. The key concept for attacking general prediction problems is Occam's razor, and to a lesser extent Epicurus' principle of multiple explanations. The former/latter may be interpreted as to keep the simplest/all theories consistent with the observations $x_1...x_{n-1}$ and to use these theories to predict $x_n$. Solomonoff [Sol64, Sol78] formalized and combined both principles in his universal a priori semimeasure $M$, which assigns high/low probability to simple/complex environments $x$, hence implementing Occam and Epicurus. Formally it can be represented as a mixture of all enumerable semimeasures. An abstract characterization of $M$ by Levin [ZL70] is that $M$ is a universal enumerable semimeasure in the sense that it multiplicatively dominates all enumerable semimeasures.

Solomonoff's [Sol78] central result is that if the probability $\mu(x_n|x_1...x_{n-1})$ of observing $x_n$ at time $n$, given past observations $x_1...x_{n-1}$, is a computable function, then the universal predictor $M_n := M(x_n|x_1...x_{n-1})$ converges (rapidly!) with $\mu$-probability 1 (w.p.1) for $n\to\infty$ to the optimal/true/informed predictor $\mu_n := \mu(x_n|x_1...x_{n-1})$, hence $M$ represents a universal predictor in case of unknown "true" distribution $\mu$. Convergence of $M_n$ to $\mu_n$ w.p.1 tells us that $M_n$ is close to $\mu_n$ for sufficiently large $n$ for almost all sequences $x_1x_2...$. It says nothing about whether convergence is true for any particular sequence (of measure 0).

Martin-Löf (M.L.) randomness is the standard notion for randomness of individual sequences [ML66, LV97]. An M.L.-random sequence passes all thinkable effective randomness tests, e.g. the law of large numbers, the law of the iterated logarithm, etc. In particular, the set of all $\mu$-random sequences has $\mu$-measure 1. It is natural to ask whether $M_n$ converges to $\mu_n$ (in difference or ratio) individually for all M.L.-random sequences. Clearly, Solomonoff's result shows that convergence may at most fail for a set of sequences with $\mu$-measure zero. A convergence result for M.L.-random sequences would be particularly interesting and natural in this context, since M.L.-randomness can be defined in terms of $M$ itself [Lev73]. Despite several attempts to solve this problem [Vov87, VL00, Hut03b], it remained open [Hut03c].

In this paper we construct an M.L.-random sequence and show the existence of a universal semimeasure which does not converge on this sequence, hence answering the open question negatively for some $M$. It remains open whether there exist (other) universal semimeasures, probably with particularly interesting additional structure and properties, for which M.L.-convergence holds. The main positive contribution of this work is the construction of a non-universal enumerable semimeasure $W$ which M.L.-converges to $\mu$ as desired. As an intermediate step we consider the incomputable measure $D$, defined as a mixture over all computable measures. We show M.L.-convergence of predictor $W$ to $D$ and of $D$ to $\mu$. The Hellinger distance measuring closeness of two predictive distributions plays a central role in this work.

The paper is organized as follows: In Section 2 we give basic notation and results (for strings, numbers, sets, functions, asymptotics, computability concepts, prefix Kolmogorov complexity), and define and discuss the concepts of (universal) (enumerable) (semi)measures. Section 3 summarizes Solomonoff's and Gács' results on predictive convergence of $M$ to $\mu$ with probability 1. Both results can be derived from a bound on the expected Hellinger sum. We present an improved bound on the expected exponentiated Hellinger sum, which implies very strong assertions on the convergence rate. In Section 4 we investigate whether convergence for all Martin-Löf random sequences holds. We construct a $\mu$-M.L.-random sequence on which some universal semimeasures $M$ do not converge to $\mu$. We give a non-constructive and a constructive proof of different virtue. In Section 5 we present our main positive result. We derive a finite bound on the Hellinger sum between $\mu$ and $D$, which is exponential in the randomness deficiency of the sequence and double exponential in the complexity of $\mu$. This implies that the predictor $D$ M.L.-converges to $\mu$. Finally, in Section 6 we show that $W$ is non-universal and asymptotically M.L.-converges to $D$, and summarize the computability, measure, and dominance properties of $M$, $D$, $\hat D$, and $W$. Section 7 contains discussion and outlook.



2 Notation & Universal Semimeasures M

Strings. Let $i,k,n,t \in \mathbb{N} = \{1,2,3,...\}$ be natural numbers, $x,y,z \in \mathcal{X}^* = \bigcup_{n=0}^\infty \mathcal{X}^n$ be finite strings of symbols over finite alphabet $\mathcal{X} \ni a,b$. We write $xy$ for the concatenation of string $x$ with $y$. We denote strings $x$ of length $\ell(x) = n$ by $x = x_1x_2...x_n \in \mathcal{X}^n$ with $x_t \in \mathcal{X}$ and further abbreviate $x_{k:n} := x_kx_{k+1}...x_{n-1}x_n$ for $k \le n$, and $x_{<n} := x_1...x_{n-1}$, and $\epsilon = x_{<1} = x_{n+1:n} \in \mathcal{X}^0 = \{\epsilon\}$ for the empty string. Let $\omega = x_{1:\infty} \in \mathcal{X}^\infty$ be a generic and $\alpha \in \mathcal{X}^\infty$ a specific infinite sequence. For a given sequence $x_{1:\infty}$ we say that $x_t$ is on-sequence and $\bar x_t \ne x_t$ is off-sequence. $x'_t$ may be on- or off-sequence. We identify strings with natural numbers (including zero, $\mathcal{X}^* \cong \mathbb{N} \cup \{0\}$).

Sets and functions. $\mathbb{Q}$, $\mathbb{R}$, $\mathbb{R}_+ := [0,\infty)$ are the sets of fractional, real, and nonnegative real numbers, respectively. $\#S$ denotes the number of elements in set $S$, $\ln()$ the natural and $\log()$ the binary logarithm.

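Asymptotics. We abbreviate $\lim_{n\to\infty}[f(n) - g(n)] = 0$ by $f(n) \stackrel{n\to\infty}{\longrightarrow} g(n)$ and say $f$ converges to $g$, without implying that $\lim_{n\to\infty} g(n)$ itself exists. We write $f(x) \stackrel{\times}{\le} g(x)$ for $f(x) = O(g(x))$ and $f(x) \stackrel{+}{\le} g(x)$ for $f(x) \le g(x) + O(1)$.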

Computability. A function $f : S \to \mathbb{R} \cup \{\infty\}$ is said to be enumerable (or lower semicomputable) if the set $\{(x,y) : y < f(x),\ x \in S,\ y \in \mathbb{Q}\}$ is recursively enumerable. $f$ is co-enumerable (or upper semicomputable) if $[-f]$ is enumerable. $f$ is computable (or estimable or recursive) if $f$ and $[-f]$ are enumerable. $f$ is approximable (or limit-computable) if there is a computable function $g : S \times \mathbb{N} \to \mathbb{R}$ with $\lim_{n\to\infty} g(x,n) = f(x)$.

Complexity. The conditional prefix (Kolmogorov) complexity $K(x|y) := \min\{\ell(p) : U(y,p) = x \text{ halts}\}$ is the length of the shortest binary program $p \in \{0,1\}^*$ on a universal prefix Turing machine $U$ with output $x \in \mathcal{X}^*$ and input $y \in \mathcal{X}^*$ [LV97]. $K(x) := K(x|\epsilon)$. For non-string objects $o$ we define $K(o) := K(\langle o \rangle)$, where $\langle o \rangle \in \mathcal{X}^*$ is some standard code for $o$. In particular, if $(f_i)_{i=1}^\infty$ is an enumeration of all enumerable functions, we define $K(f_i) = K(i)$. We only need the following elementary properties: the co-enumerability of $K$, the upper bounds $K(x|\ell(x)) \stackrel{+}{\le} \ell(x)\log|\mathcal{X}|$ and $K(n) \stackrel{+}{\le} 2\log n$, and $K(x|y) \stackrel{+}{\le} K(x)$, subadditivity $K(x) \stackrel{+}{\le} K(x,y) \stackrel{+}{\le} K(y) + K(x|y)$, and information non-increase $K(f(x)) \stackrel{+}{\le} K(x) + K(f)$ for recursive $f : \mathcal{X}^* \to \mathcal{X}^*$.

We need the concepts of (universal) (semi)measures for strings [ZL70].

Definition 1 ((Semi)measures) We call $\nu : \mathcal{X}^* \to [0,1]$ a semimeasure if $\nu(x) \ge \sum_{a\in\mathcal{X}} \nu(xa)$ $\forall x \in \mathcal{X}^*$, and a (probability) measure if equality holds and $\nu(\epsilon) = 1$. $\nu(x)$ denotes the $\nu$-probability that a sequence starts with string $x$. Further, $\nu(a|x) := \nu(xa)/\nu(x)$ is the predictive $\nu$-probability that the next symbol is $a \in \mathcal{X}$, given sequence $x \in \mathcal{X}^*$.

Definition 2 (Universal semimeasures $M$) A semimeasure $M$ is called a universal element of a class of semimeasures $\mathcal{M}$, if it multiplicatively dominates all members in the sense that

$$M \in \mathcal{M} \quad\text{and}\quad \forall \nu \in \mathcal{M}\ \exists w_\nu > 0 : M(x) \ge w_\nu\cdot\nu(x)\ \ \forall x \in \mathcal{X}^*.$$

From now on we consider the (in a sense) largest class $\mathcal{M}$ which is relevant from a constructive point of view (but see [Sch00, Sch02, Hut03b] for even larger constructive classes), namely the class of all semimeasures, which can be enumerated (=effectively be approximated) from below:

$$\mathcal{M} := \text{class of all enumerable semimeasures}. \qquad (1)$$

Solomonoff [Sol64, Eq.(7)] defined the universal predictor $M(y|x) = M(xy)/M(x)$ with $M(x)$ defined as the probability that the output of a universal monotone Turing machine starts with $x$ when provided with fair coin flips on the input tape. Levin [ZL70] has shown that this $M$ is a universal enumerable semimeasure. Another possible definition of $M$ is as a (Bayes) mixture [Sol64, ZL70, Sol78, LV97, Hut03b, Hut05]: $\tilde M(x) = \sum_{\nu\in\mathcal{M}} 2^{-K(\nu)}\nu(x)$, where $K(\nu)$ is the length of the shortest program computing function $\nu$. Levin [ZL70] has shown that the class of all enumerable semimeasures is enumerable (with repetitions), hence $\tilde M$ is enumerable, since $K$ is co-enumerable. Hence $\tilde M \in \mathcal{M}$, which implies

$$M(x) \ge w_{\tilde M}\tilde M(x) \ge w_{\tilde M}2^{-K(\nu)}\nu(x) = w'_\nu\,\nu(x), \quad\text{where}\quad w'_\nu \stackrel{\times}{=} 2^{-K(\nu)}. \qquad (2)$$
Up to a multiplicative constant, $M$ assigns higher probability to all $x$ than any other enumerable semimeasure. All $M$ have the same very slowly decreasing (in $\nu$) domination constants $w'_\nu$, essentially because $\tilde M \in \mathcal{M}$. We drop the prime from $w'_\nu$ in the following. The mixture definition $\tilde M$ immediately generalizes to arbitrary weighted sums of (semi)measures over countable classes other than $\mathcal{M}$, but the class may not contain the mixture, and the domination constants may be rapidly decreasing. We will exploit this for the construction of the non-universal semimeasure $W$ in Sections 5 and 6.



3 Predictive Convergence with Probability 1 

The following convergence results for $M$ are well-known [Sol78, LV97, Hut03a, Hut05].

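Theorem 3 (Convergence of $M$ to $\mu$ w.p.1) For any universal semimeasure $M$ and any computable measure $\mu$ it holds:

$$M(x'_n|x_{<n}) \to \mu(x'_n|x_{<n}) \text{ for any } x'_n \quad\text{and}\quad \frac{M(x_n|x_{<n})}{\mu(x_n|x_{<n})} \to 1, \quad\text{both w.p.1 for } n\to\infty.$$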



The first convergence in difference is Solomonoff's [Sol78] celebrated convergence result. The second convergence in ratio has first been derived by Gács [LV97]. Note the subtle difference between the two convergence results. For any sequence $x'_{1:\infty}$ (possibly constant and not necessarily random), $M(x'_n|x_{<n}) - \mu(x'_n|x_{<n})$ converges to zero w.p.1 (referring to $x_{1:\infty}$), but no statement is possible for $M(x'_n|x_{<n})/\mu(x'_n|x_{<n})$, since $\liminf \mu(x'_n|x_{<n})$ could be zero. On the other hand, if we stay on-sequence ($x'_{1:\infty} = x_{1:\infty}$), we have $M(x_n|x_{<n})/\mu(x_n|x_{<n}) \to 1$ (whether $\mu(x_n|x_{<n})$ tends to zero or not does not matter). Indeed, it is easy to give an example where $M(x'_n|x_{<n})/\mu(x'_n|x_{<n})$ diverges. For $\mu(1|x_{<n}) = 1 - \mu(0|x_{<n}) = \frac12 n^{-3}$ we get $\mu(0_{1:n}) = \prod_{t=1}^n (1 - \frac12 t^{-3}) \to c = 0.450... > 0$, i.e. $0_{1:\infty}$ is $\mu$-random. On the other hand, one can show that $M(0_{<n}) = O(1)$ and $M(0_{<n}1) \stackrel{\times}{=} 2^{-K(n)}$, which implies $M(1|0_{<n})/\mu(1|0_{<n}) \stackrel{\times}{=} n^3\cdot 2^{-K(n)} \stackrel{\times}{\ge} n \to \infty$ for $n\to\infty$ ($K(n) \stackrel{+}{\le} 2\log n$).
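
The constant in this example can be checked numerically. The following is a minimal sketch, assuming only the predictive probabilities stated above; the function names are illustrative and not part of the paper. It evaluates the product for the all-zero sequence and the growth rate $n^3\cdot 2^{-2\log n} = n$ obtained from the upper bound $K(n) \stackrel{+}{\le} 2\log n$ (constants dropped).

```python
import math

def mu_all_zeros(n: int) -> float:
    """mu(0_{1:n}) = prod_{t=1}^{n} (1 - t**-3 / 2) for the example measure."""
    return math.prod(1.0 - 0.5 * t ** -3 for t in range(1, n + 1))

def ratio_growth(n: int) -> float:
    """n**3 * 2**(-2*log2(n)): growth of the off-sequence ratio, constants dropped.

    Uses the bound K(n) <= 2 log n + O(1); K itself is not computable.
    """
    return n ** 3 * 2.0 ** (-2.0 * math.log2(n))

c_approx = mu_all_zeros(100_000)  # partial product, already close to the limit c
```

The partial product settles quickly because the factors approach 1 like $1 - \frac12 t^{-3}$, which is why the all-zero sequence keeps positive $\mu$-probability.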

Theorem 3 follows from (the discussion after) Lemma 4 due to $M(x) \ge w_\mu\mu(x)$. Actually the Lemma strengthens and generalizes Theorem 3. In the following we denote expectations w.r.t. measure $\rho$ by $E_\rho$, i.e. for a function $f : \mathcal{X}^n \to \mathbb{R}$, $E_\rho[f] = \sum'_{x_{1:n}} \rho(x_{1:n}) f(x_{1:n})$, where $\sum'$ sums over all $x_{1:n}$ for which $\rho(x_{1:n}) \ne 0$. Using $\sum'$ instead of $\sum$ is (only) important for partial functions $f$ undefined on a set of $\rho$-measure zero. Similarly $P_\rho$ denotes the $\rho$-probability.

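Lemma 4 (Expected Bounds on Hellinger Sum) Let $\mu$ be a measure and $\nu$ be a semimeasure with $\nu(x) \ge w\cdot\mu(x)$ $\forall x$. Then the following bounds on the Hellinger distance $h_t(\nu,\mu|\omega_{<t}) := \sum_{a\in\mathcal{X}} (\sqrt{\nu(a|\omega_{<t})} - \sqrt{\mu(a|\omega_{<t})})^2$ hold:

$$\sum_{t=1}^\infty E\Big[\Big(\sqrt{\tfrac{\nu(\omega_t|\omega_{<t})}{\mu(\omega_t|\omega_{<t})}} - 1\Big)^2\Big] \stackrel{(i)}{\le} \sum_{t=1}^\infty E[h_t] \stackrel{(ii)}{\le} 2\ln\Big\{E\Big[\exp\Big(\tfrac12\sum_{t=1}^\infty h_t\Big)\Big]\Big\} \stackrel{(iii)}{\le} \ln w^{-1}$$

where $E$ here and later means expectation w.r.t. $\mu$.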

The $\ln w^{-1}$-bounds on the first and second expression have first been derived in [Hut03a], the second being a variation of Solomonoff's bound $\sum_n E[(\nu(0|x_{<n}) - \mu(0|x_{<n}))^2] \le \frac12\ln w^{-1}$. If sequence $x_1x_2...$ is sampled from the probability measure $\mu$, these bounds imply

$$\nu(x'_n|x_{<n}) \to \mu(x'_n|x_{<n}) \text{ for any } x'_n \quad\text{and}\quad \frac{\nu(x_n|x_{<n})}{\mu(x_n|x_{<n})} \to 1, \quad\text{both w.p.1 for } n\to\infty,$$

where w.p.1 stands here and in the following for 'with $\mu$-probability 1'.

Convergence is "fast" in the following sense: The second bound ($\sum_t E[h_t] \le \ln w^{-1}$) implies that the expected number of times $t$ in which $h_t \ge \varepsilon$ is finite and bounded by $\frac1\varepsilon\ln w^{-1}$. The new third bound represents a significant improvement. It implies by means of a Markov inequality that the probability of even only marginally exceeding this number is extremely small, and that $\sum_t h_t$ is very unlikely to exceed $\ln w^{-1}$ by much. More precisely:

$$P[\#\{t : h_t \ge \varepsilon\} \ge \tfrac1\varepsilon(\ln w^{-1} + c)] \le P[\textstyle\sum_t h_t \ge \ln w^{-1} + c]$$
$$= P[\exp(\tfrac12\textstyle\sum_t h_t) \ge e^{c/2}w^{-1/2}] \le \sqrt{w}\,E[\exp(\tfrac12\textstyle\sum_t h_t)]\,e^{-c/2} \le e^{-c/2}.$$
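
The three bounds of Lemma 4 can be evaluated exactly for a short horizon. The sketch below is an illustration under assumptions that are not in the paper: a binary alphabet, $\mu$ a Bernoulli(0.3) measure, and $\nu = w\mu + (1-w)\rho$ with $\rho$ a Bernoulli(0.8) measure, so that $\nu \ge w\cdot\mu$ holds by construction. It enumerates all sequences of length $n$ and computes $\sum_t E[h_t]$ and $2\ln E[\exp(\frac12\sum_t h_t)]$ for the truncated sums.

```python
import itertools
import math

def bernoulli(theta):
    """Return x -> probability of the bit tuple x under i.i.d. P(1) = theta."""
    def prob(x):
        ones = sum(x)
        return theta ** ones * (1 - theta) ** (len(x) - ones)
    return prob

def mixture(w, mu, rho):
    """nu = w*mu + (1-w)*rho, which dominates w*mu."""
    return lambda x: w * mu(x) + (1 - w) * rho(x)

def hellinger_step(nu, mu, prefix):
    """h_t(nu, mu | prefix) = sum_a (sqrt(nu(a|prefix)) - sqrt(mu(a|prefix)))**2."""
    return sum(
        (math.sqrt(nu(prefix + (a,)) / nu(prefix))
         - math.sqrt(mu(prefix + (a,)) / mu(prefix))) ** 2
        for a in (0, 1)
    )

def lemma4_quantities(nu, mu, n):
    """Exact mu-expectations over all 2**n sequences, sums truncated at t = n."""
    expected_sum = 0.0
    expected_exp = 0.0
    for x in itertools.product((0, 1), repeat=n):
        h_total = sum(hellinger_step(nu, mu, x[:t]) for t in range(n))
        expected_sum += mu(x) * h_total
        expected_exp += mu(x) * math.exp(0.5 * h_total)
    return expected_sum, 2 * math.log(expected_exp)

w = 0.25
mu = bernoulli(0.3)
nu = mixture(w, mu, bernoulli(0.8))
sum_h, two_log_exp = lemma4_quantities(nu, mu, n=10)
log_inverse_w = math.log(1 / w)
```

The values come out ordered as `sum_h <= two_log_exp <= log_inverse_w`, matching inequalities (ii) and (iii) for the truncated sums.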

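Proof. We use the abbreviations $\rho_t = \rho(x_t|x_{<t})$ and $\rho_{1:n} = \rho_1\cdot...\cdot\rho_n = \rho(x_{1:n})$ for $\rho \in \{\mu,\nu,R,N,...\}$ and $h_t = \sum_{x_t}(\sqrt{\nu_t} - \sqrt{\mu_t})^2$.
(i) follows from

$$E\Big[\Big(\sqrt{\tfrac{\nu_t}{\mu_t}} - 1\Big)^2\Big|x_{<t}\Big] = \sum_{x_t:\mu_t\ne 0} \mu_t\Big(\sqrt{\tfrac{\nu_t}{\mu_t}} - 1\Big)^2 = \sum_{x_t:\mu_t\ne 0} (\sqrt{\nu_t} - \sqrt{\mu_t})^2 \le h_t$$

by taking the expectation $E[\,]$ and sum $\sum_{t=1}^\infty$.

(ii) follows from Jensen's inequality $\exp(E[f]) \le E[\exp(f)]$ for $f = \frac12\sum_t h_t$.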

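(iii) We exploit a construction used in [Vov87, Thm.1]. For discrete (semi)measures $p$ and $q$ with $\sum_i p_i = 1$ and $\sum_i q_i \le 1$ it holds:

$$\sum_i \sqrt{p_iq_i} \le 1 - \tfrac12\sum_i(\sqrt{p_i} - \sqrt{q_i})^2 \le \exp\Big[-\tfrac12\sum_i(\sqrt{p_i} - \sqrt{q_i})^2\Big]. \qquad (3)$$

The first inequality is obvious after multiplying out the second expression. The second inequality follows from $1 - x \le e^{-x}$. Vovk [Vov87] defined a measure $R_t := \sqrt{\mu_t\nu_t}/N_t$ with normalization $N_t := \sum_{x_t}\sqrt{\mu_t\nu_t}$. Applying (3) for measure $\mu$ and semimeasure $\nu$ we get $N_t \le \exp(-\frac12 h_t)$. Together with $\nu(x) \ge w\cdot\mu(x)$ $\forall x$ this implies

$$\prod_{t=1}^n R_t = \prod_{t=1}^n \frac{\sqrt{\mu_t\nu_t}}{N_t} = \mu_{1:n}\sqrt{\frac{\nu_{1:n}}{\mu_{1:n}}}\prod_{t=1}^n N_t^{-1} \ge \mu_{1:n}\sqrt{w}\,\exp\Big(\tfrac12\sum_{t=1}^n h_t\Big).$$

Summing over $x_{1:n}$ and exploiting $\sum_{x_t}R_t = 1$ we get $1 \ge \sqrt{w}\,E[\exp(\frac12\sum_t h_t)]$, which proves (iii).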
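
Inequality (3) is easy to probe numerically. The sketch below is a sanity check on randomly drawn vectors, not a proof; the helper names, the fixed seed, and the way the vectors are sampled are assumptions made for illustration. It draws a measure $p$ (sum 1) and a semimeasure $q$ (sum at most 1) and returns the three expressions of (3).

```python
import math
import random

def inequality_3_terms(p, q):
    """Return (sum_i sqrt(p_i q_i), 1 - h/2, exp(-h/2)) with h the Hellinger distance."""
    overlap = sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))
    h = sum((math.sqrt(pi) - math.sqrt(qi)) ** 2 for pi, qi in zip(p, q))
    return overlap, 1 - 0.5 * h, math.exp(-0.5 * h)

def random_weights(rng, size, total):
    """Random nonnegative vector of the given size summing to `total`."""
    raw = [rng.random() + 1e-9 for _ in range(size)]
    scale = total / sum(raw)
    return [scale * r for r in raw]

rng = random.Random(0)  # fixed seed so the check is reproducible
samples = [
    inequality_3_terms(
        random_weights(rng, 4, total=1.0),           # p is a measure
        random_weights(rng, 4, total=rng.random()),  # q is a semimeasure
    )
    for _ in range(1000)
]
```

Multiplying out $1 - \frac12\sum_i(\sqrt{p_i}-\sqrt{q_i})^2$ gives $\frac12(1 - \sum_i q_i) + \sum_i\sqrt{p_iq_i}$, which shows where the slack in the first inequality comes from.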
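The bound and proof may be generalized to $1 \ge w^\kappa E[\exp(\frac12\sum_t\sum_{x_t}|\mu_t^\kappa - \nu_t^\kappa|^{1/\kappa})]$ with $0 < \kappa \le \frac12$ by defining $R_t = \mu_t^{1-\kappa}\nu_t^\kappa/N_t$ with $N_t = \sum_{x_t}\mu_t^{1-\kappa}\nu_t^\kappa$ and exploiting the corresponding generalization of (3).
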
One can show that the constant $\frac12$ in Lemma 4 can essentially not be improved. Increasing it to a constant $\alpha > 1$ makes the expression infinite for some (Bernoulli) distribution $\mu$ (however we choose $\nu$). For $\nu = M$ the expression can become already infinite for $\alpha > \frac12$ and some computable measure $\mu$.



4 Non-Convergence in Martin-Löf Sense

Convergence of $M(x_n|x_{<n})$ to $\mu(x_n|x_{<n})$ with $\mu$-probability 1 tells us that $M(x_n|x_{<n})$ is close to $\mu(x_n|x_{<n})$ for sufficiently large $n$ on 'most' sequences $x_{1:\infty}$. It says nothing about whether convergence is true for any particular sequence (of measure 0). Martin-Löf randomness can be used to capture convergence properties for individual sequences. Martin-Löf randomness is a very important and default concept of randomness of individual sequences, which is closely related to Kolmogorov complexity and Solomonoff's universal semimeasure $M$. Levin gave a characterization equivalent to Martin-Löf's original definition [Lev73]:

Definition 5 (Martin-Löf random sequences) A sequence $\omega = \omega_{1:\infty}$ is $\mu$-Martin-Löf random ($\mu$.M.L.) iff there is a constant $c < \infty$ such that $M(\omega_{1:n}) \le c\cdot\mu(\omega_{1:n})$ for all $n$. Moreover, $d_\mu(\omega) := \sup_n\{\log\frac{M(\omega_{1:n})}{\mu(\omega_{1:n})}\} \le \log c$ is called the randomness deficiency of $\omega$.

One can show that an M.L.-random sequence $x_{1:\infty}$ passes all thinkable effective randomness tests, e.g. the law of large numbers, the law of the iterated logarithm, etc. In particular, the set of all $\mu$.M.L.-random sequences has $\mu$-measure 1.

The open question we study in this section is whether $M$ converges to $\mu$ (in difference or ratio) individually for all Martin-Löf random sequences. Clearly, Theorem 3 implies that convergence $\mu$.M.L. may at most fail for a set of sequences with $\mu$-measure zero. A convergence M.L. result would be particularly interesting and natural for $M$, since M.L.-randomness can be defined in terms of $M$ itself (Definition 5).

The state of the art regarding this problem may be summarized as follows: [Vov87] contains a (non-improvable?) result which is slightly too weak to imply M.L.-convergence, [LV97, Thm.5.2.2] and [VL00, Thm.10] contain an erroneous proof for M.L.-convergence, and [Hut03b] proves a theorem indicating that the answer may be hard and subtle (see [Hut03b] for details).

The main contribution of this section is a partial answer to this question. We 
show that M.L.-convergence fails at least for some universal semimeasures: 
Theorem 6 (Universal semimeasure non-convergence) There exists a universal semimeasure $M$ and a computable measure $\mu$ and a $\mu$.M.L.-random sequence $\alpha$, such that

$$M(\alpha_n|\alpha_{<n}) \not\to \mu(\alpha_n|\alpha_{<n}) \quad\text{for}\quad n\to\infty.$$



This implies that also $M_n/\mu_n$ does not converge (since $\mu_n \le 1$ is bounded). We do not know whether Theorem 6 holds for all universal semimeasures. For the proof we need the concept of supermartingales. We only define it for binary alphabet and uniform measure $\mu(x) = \lambda(x) := 2^{-\ell(x)}$ for which we need it.

Definition 7 (Supermartingale)

$$m : \{0,1\}^* \to \mathbb{R} \text{ is a supermartingale} \ :\Leftrightarrow\ m(x) \ge \tfrac12[m(x0) + m(x1)] \text{ for all } x \in \{0,1\}^*.$$

If $\nu$ is a (enumerable) semimeasure, then $m := \nu/\lambda$ is a (enumerable) supermartingale. We prove the following lemma, which will imply Theorem 6.
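
The step from semimeasures to supermartingales can be made concrete. The sketch below is a minimal illustration assuming a Bernoulli(0.7) measure as a hypothetical stand-in for $\nu$ (the paper's interest is in enumerable, generally incomputable $\nu$); it checks the supermartingale inequality for $m = \nu/\lambda$ on all strings up to a small depth.

```python
import itertools

def nu(x, theta=0.7):
    """Illustrative computable measure: i.i.d. bits with P(1) = theta."""
    ones = sum(x)
    return theta ** ones * (1 - theta) ** (len(x) - ones)

def lam(x):
    """Uniform measure lambda(x) = 2**(-len(x))."""
    return 2.0 ** -len(x)

def m(x):
    """Candidate supermartingale m = nu / lambda."""
    return nu(x) / lam(x)

def is_supermartingale(f, depth, tol=1e-9):
    """Check f(x) >= (f(x0) + f(x1)) / 2 for every bit string x shorter than depth."""
    return all(
        f(x) + tol >= 0.5 * (f(x + (0,)) + f(x + (1,)))
        for n in range(depth)
        for x in itertools.product((0, 1), repeat=n)
    )

holds = is_supermartingale(m, depth=8)
```

For a measure the inequality holds with equality, since $\nu(x0) + \nu(x1) = \nu(x)$ and $\lambda(xa) = \frac12\lambda(x)$; for a proper semimeasure it becomes strict wherever probability mass is lost.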

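Lemma 8 (Supermartingale non-convergence) For the M.L.-random sequence $\alpha$ defined in (4) and the enumerable supermartingale $r$ defined in Lemma 9 and for any $\eta,\eta' \in \mathbb{R}$ and any on $\alpha$ bounded supermartingale $R$, i.e. $0 < \varepsilon \le R(\alpha_{1:n}) \le c < \infty$ $\forall n$, it holds that

$$\Big|\frac{R(\alpha_{1:n})}{R(\alpha_{<n})} - \eta\Big| \ge \delta \quad\text{or}\quad \Big|\frac{R'(\alpha_{1:n})}{R'(\alpha_{<n})} - \eta'\Big| \ge \delta$$

(or both) for a non-vanishing fraction of $n$, where supermartingale $R' := \frac12(R + r)$ and some $\delta > 0$.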



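Proof. We define a sequence $\alpha$, which, in a sense, is the lexicographically first (or equivalently left-most in the tree of sequences) $\lambda$.M.L.-random sequence. Formally we define $\alpha$, inductively in $n = 1,2,3,...$ by

$$\alpha_n = 0 \text{ if } M(\alpha_{<n}0) \le 2^{-n}, \quad\text{and}\quad \alpha_n = 1 \text{ else}. \qquad (4)$$

We know that $M(\epsilon) \le 1$ and $M(\alpha_{<n}0) \le 2^{-n}$ if $\alpha_n = 0$. Inductively, assuming $M(\alpha_{<n}) \le 2^{-n+1}$ for $\alpha_n = 1$ we have $2^{-n+1} \ge M(\alpha_{<n}) \ge M(\alpha_{<n}0) + M(\alpha_{<n}1) > 2^{-n} + M(\alpha_{<n}1)$ since $M$ is a semimeasure, hence $M(\alpha_{<n}1) \le 2^{-n}$. Hence

$$M(\alpha_{1:n}) \le 2^{-n} = \lambda(\alpha_{1:n})\ \forall n, \quad\text{i.e. } \alpha \text{ is } \lambda\text{.M.L.-random}. \qquad (5)$$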
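
The inductive construction (4) and the invariant (5) can be traced on a toy example. The sketch below is only an illustration: the universal $M$ is not computable, so a hypothetical computable stand-in (half uniform measure, half a Bernoulli measure with $P(1) = \frac1{10}$) takes its place. The resulting sequence is therefore computable and not random; the point is only that the rule (4) keeps $M(\alpha_{1:n}) \le 2^{-n}$ for any semimeasure.

```python
from fractions import Fraction

def toy_semimeasure(x):
    """Hypothetical computable stand-in for M (an assumption, not the paper's M)."""
    ones = sum(x)
    uniform = Fraction(1, 2) ** len(x)
    biased = Fraction(1, 10) ** ones * Fraction(9, 10) ** (len(x) - ones)
    return (uniform + biased) / 2

def leftmost_sequence(M, length):
    """Rule (4): alpha_n = 0 if M(alpha_<n 0) <= 2**-n, else alpha_n = 1."""
    alpha = ()
    for n in range(1, length + 1):
        bit = 0 if M(alpha + (0,)) <= Fraction(1, 2 ** n) else 1
        alpha += (bit,)
    return alpha

alpha = leftmost_sequence(toy_semimeasure, 40)
invariant_holds = all(
    toy_semimeasure(alpha[:n]) <= Fraction(1, 2 ** n) for n in range(1, 41)
)
```

Exact rational arithmetic is used so that the comparison with $2^{-n}$ is not affected by rounding.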

With $R$ and $r$, also $R' := \frac12(R + r) > 0$ is a supermartingale. We prove that the Lemma holds for infinitely many $n$. It is easy to refine the proof to a non-vanishing fraction of $n$'s. Assume that $R(\alpha_{1:n})/R(\alpha_{<n}) \to \eta$ for $n\to\infty$ (otherwise we are done). $\eta > 1$ implies $R \to \infty$, $\eta < 1$ implies $R \to 0$. Since $R$ is bounded, $\eta$ must be 1, hence for sufficiently large $n_0$ we have $|R(\alpha_{1:n}) - R(\alpha_{<n})| < \varepsilon$ for all $n \ge n_0$.

(Footnote: Alternatively we may define $\alpha_n = 0$ if $M(0|\alpha_{<n}) \le \frac12$ and $\alpha_n = 1$ else.)
Assume $r \in \{0,\frac12,1\}$ and $r(\alpha_{1:n}) = \frac12$ for infinitely many $n$ and $r(\alpha_{1:n}) = 1$ for infinitely many $n$ (e.g. take $r$ as defined in Lemma 9). Since $R$ stabilizes and $r$ oscillates, $R'$ cannot converge. Formally, for (the infinitely many) $n \ge n_0$ for which $r(\alpha_{<n}) = \frac12$ and $r(\alpha_{1:n}) = 1$ we have

$$\frac{R'(\alpha_{1:n})}{R'(\alpha_{<n})} - 1 = \frac{R(\alpha_{1:n}) - R(\alpha_{<n}) + r(\alpha_{1:n}) - r(\alpha_{<n})}{R(\alpha_{<n}) + r(\alpha_{<n})} \ge \frac{-\varepsilon + \frac12}{c + \frac12} \ge \delta > 0$$

for sufficiently small $\varepsilon$ and $\delta$. Similarly for (the infinitely many) $n \ge n_0$ for which $r(\alpha_{<n}) = 1$ and $r(\alpha_{1:n}) = \frac12$ we have

$$1 - \frac{R'(\alpha_{1:n})}{R'(\alpha_{<n})} = \frac{R(\alpha_{<n}) - R(\alpha_{1:n}) + r(\alpha_{<n}) - r(\alpha_{1:n})}{R(\alpha_{<n}) + r(\alpha_{<n})} \ge \frac{-\varepsilon + \frac12}{c + 1} \ge \delta > 0.$$

This shows that Lemma 8 holds for infinitely many $n$. If we define $r$ zero off-sequence, i.e. $r(x) = 0$ for $x \ne \alpha_{1:\ell(x)}$, then $r$ is a supermartingale, but a non-enumerable one, since $\alpha$ is not computable. In the next lemma we define an enumerable supermartingale $r$, which completes the proof of Lemma 8. Finally note that we could have defined $R' = (1-\gamma)R + \gamma r$ with arbitrarily small $\gamma > 0$, showing that already a small contamination can destroy convergence. This is no longer true for the constructive proof below. □



Lemma 9 (Enumerable supermartingale) Let $M^t$ with $t = 1,2,3,...$ be computable approximations of $M$, which enumerate $M$, i.e. $M^t(x) \nearrow M(x)$ for $t\to\infty$. For each $t$ define recursively a sequence $\alpha^t$ similarly to (4) as $\alpha^t_n = 0$ if $M^t(\alpha^t_{<n}0) \le 2^{-n}$ and $\alpha^t_n = 1$ else. For even $\ell(x)$ we define $r(x) = 1$ if $\exists t,n : x = \alpha^t_{<n}$ and $r(x) = 0$ else. For odd $\ell(x)$ we define $r(x) = \frac12[r(x0) + r(x1)]$. $r$ is an enumerable supermartingale with $r(\alpha_{1:n})$ being 1 and $\frac12$ for a non-vanishing fraction of $n$'s, where $\alpha = \lim_{t\to\infty}\alpha^t$ ($\alpha^t \nearrow \alpha$ lexicographically increasing).

The idea behind the definition of $r$ is to define $r(\alpha_{<n}) = 1$ for odd $n$ and if possible $\frac12$ for even $n$. The following possibilities exist for the local part of the sequence tree, written as a parent value $r(x)$ followed by its two child values $(r(x0), r(x1))$:

  $\ell(x)$ odd:  $\frac12 \to (1,0)$ or $\frac12 \to (0,1)$ or $1 \to (1,1)$,
  $\ell(x)$ even: $1 \to (\frac12,0)$ or $1 \to (0,\frac12)$ or $1 \to (\frac12,\frac12)$,

all respecting the supermartingale property. The formal proof goes as follows:

Proof. $r$ is enumerable, since $\alpha^t$ is computable. Further, $0 \le r(x) \le 1$ $\forall x$. For odd $\ell(x)$ the supermartingale property $r(x) \ge \frac12[r(x0) + r(x1)]$ is obviously satisfied. For even $\ell(x)$ and $x = \alpha^t_{<n}$ for some $t$ we have $r(x) = 1 = \frac12[1 + 1] \ge \frac12[r(x0) + r(x1)]$. Even $\ell(x)$ and $x \ne \alpha^t_{<n}$ $\forall t$ implies $xy \ne \alpha^t_{1:\ell(xy)}$ $\forall t,y$, hence $r(x) = 0 = \frac12[0 + 0] = \frac12[r(x0) + r(x1)]$. This shows that $r$ is a supermartingale.

Since $M^t$ is monotone increasing, $\alpha^t$ is also monotone increasing w.r.t. lexicographical ordering on $\{0,1\}^\infty$. Hence $\alpha^t_{1:n}$ converges to $\alpha_{1:n}$ for $t\to\infty$, and even $\alpha^t_{1:n} = \alpha_{1:n}$ $\forall t \ge t_n$ and sufficiently large ($n$-dependent) $t_n$. This implies $r(\alpha_{<n}) = r(\alpha^{t_n}_{<n}) = 1$ for odd $n$. We know that $\alpha_n = 0$ for a non-vanishing fraction of (even) $n$, since $\alpha$ is random. For such $n$, $\alpha^t_n = 0$ $\forall t$, hence $r(\alpha_{<n}) = \frac12[r(\alpha_{<n}0) + r(\alpha_{<n}1)] = \frac12[1 + 0] = \frac12$. This shows that $r(\alpha_{<n}) = 1$ ($\frac12$) for a non-vanishing fraction of $n$, namely the odd ones (the even ones with $\alpha_n = 0$). □

Nonconstructive Proof of Theorem 6. Use Lemma 8 with $R := M/\lambda$, $R' := M'/\lambda$, $r =: q/\lambda$, hence $q$ is an enumerable semimeasure, hence with $M$, also $M' = \frac12(M + q)$ is a universal semimeasure. $R(\alpha_{1:n}) \le 1$ from (5) and $R(x) \ge c > 0$ from universality of $M$ and computability of $\lambda$ show that the conditions of Lemma 8 are satisfied. Hence $R^{(\prime)}(\alpha_{1:n})/R^{(\prime)}(\alpha_{<n}) = M^{(\prime)}(\alpha_n|\alpha_{<n})/\lambda(\alpha_n|\alpha_{<n}) \not\to 1$. Multiplying this by $\lambda_n = \mu_n = \frac12$ completes the proof. □

The proof of Theorem 6 is non-constructive. Either $M$ or $M'$ (or both) do not converge, but we do not know which one. Below we give an alternative proof which is constructive. The idea is to construct an enumerable (semi)measure $\nu$ such that $\nu$ dominates $M$ on $\alpha$, but $\nu(\alpha_n|\alpha_{<n}) \not\to \frac12$. Then we mix $M$ to $\nu$ to make $\nu$ universal, but with larger contribution from $\nu$, in order to preserve non-convergence.

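Constructive Proof of Theorem 6. We define an enumerable semimeasure $\nu$ as follows:

$$\nu^t(x) := \begin{cases} 2^{-t} & \text{if } \ell(x) = t \text{ and } x < \alpha^t_{1:t} \\ 0 & \text{if } \ell(x) = t \text{ and } x \ge \alpha^t_{1:t} \\ 0 & \text{if } \ell(x) > t \\ \nu^t(x0) + \nu^t(x1) & \text{if } \ell(x) < t \end{cases} \qquad (6)$$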

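where $<$ is the lexicographical ordering on sequences, and $\alpha^t$ has been defined in Lemma 9. $\nu^t$ is a semimeasure, and with $\alpha^t$ also $\nu^t$ is computable and monotone increasing in $t$, hence $\nu := \lim_{t\to\infty}\nu^t$ is an enumerable semimeasure (indeed, $\nu/\nu(\epsilon)$ is a measure). We could have defined a $\nu^{tn}$ by replacing $\alpha^t_{1:t}$ with $\alpha^n_{1:t}$ in (6). Since $\nu^{tn}$ is monotone increasing in $t$ and $n$, any order of $t,n \to \infty$ leads to $\nu$, so we have chosen arbitrarily $t = n$. By induction (starting from $\ell(x) = t$) it follows that

$$\nu^t(x) = 2^{-\ell(x)} \text{ if } x < \alpha^t_{1:\ell(x)} \text{ and } \ell(x) \le t, \qquad \nu^t(x) = 0 \text{ if } x > \alpha^t_{1:\ell(x)}.$$

On-sequence, i.e. for $x = \alpha^t_{1:\ell(x)}$, $\nu^t$ is somewhere in-between 0 and $2^{-\ell(x)}$. Since sequence $\alpha := \lim_t \alpha^t$ is $\lambda$.M.L.-random it contains 01 infinitely often, actually $\alpha_n\alpha_{n+1} = 01$ for a non-vanishing fraction of $n$. In the following we fix such an $n$. For $t \ge n$ we get

$$\nu^t(\alpha_{<n}) = \nu^t(\alpha_{<n}0) + \nu^t(\alpha_{<n}1) = \nu^t(\alpha_{<n}0) = \nu^t(\alpha_{1:n}) \ \Longrightarrow\ \nu(\alpha_{<n}) = \nu(\alpha_{1:n}),$$

where $\nu^t(\alpha_{<n}1) = 0$ since $\alpha_{<n}1 > \alpha_{1:n} \ge \alpha^t_{1:n}$, since $\alpha_n = 0$. This ensures $\nu(\alpha_n|\alpha_{<n}) = 1 \ne \frac12 = \lambda_n$. For $t > n$ large enough such that $\alpha^t_{1:n+1} = \alpha_{1:n+1}$ we get:

$$\nu^t(\alpha_{1:n}) = \nu^t(\alpha^t_{1:n}) \ge \nu^t(\alpha^t_{1:n}0) = 2^{-n-1} \ \Longrightarrow\ \nu(\alpha_{1:n}) \ge 2^{-n-1}.$$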
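This ensures $\nu(\alpha_{1:n}) \ge 2^{-n-1} \ge \frac12 M(\alpha_{1:n})$ by (5). Let $M$ be any universal semimeasure and $0 < \gamma < \frac15$. Then $M'(x) := (1-\gamma)\nu(x) + \gamma M(x)$ $\forall x$ is also a universal semimeasure with

$$M'(\alpha_n|\alpha_{<n}) = \frac{(1-\gamma)\nu(\alpha_{1:n}) + \gamma M(\alpha_{1:n})}{(1-\gamma)\nu(\alpha_{<n}) + \gamma M(\alpha_{<n})} \ge \frac{(1-\gamma)\nu(\alpha_{1:n})}{(1-\gamma)\nu(\alpha_{<n}) + \gamma 2^{-n+1}}$$
$$= \frac{1-\gamma}{1-\gamma+\gamma 2^{-n+1}/\nu(\alpha_{1:n})} \ge \frac{1-\gamma}{1+3\gamma},$$

where the first inequality uses $M(\alpha_{<n}) \le 2^{-n+1}$ and $M(\alpha_{1:n}) \ge 0$, the equality uses $\nu(\alpha_{<n}) = \nu(\alpha_{1:n})$, and the last inequality uses $\nu(\alpha_{1:n}) \ge 2^{-n-1}$. For instance for $\gamma = \frac19$ we have $M'(\alpha_n|\alpha_{<n}) \ge \frac23 \ne \frac12 = \lambda(\alpha_n|\alpha_{<n})$ for a non-vanishing fraction of $n$'s. Note that the contamination of $M$ with $\nu$ must be sufficiently large ($\gamma$ sufficiently small), while an advantage of the non-constructive proof is that an arbitrarily small contamination sufficed. □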
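
The arithmetic behind the admissible mixing weights can be checked exactly. The sketch below assumes the lower bound $(1-\gamma)/(1+3\gamma)$ on $M'(\alpha_n|\alpha_{<n})$ that follows from $\nu(\alpha_{<n}) = \nu(\alpha_{1:n}) \ge 2^{-n-1}$ and $M(\alpha_{<n}) \le 2^{-n+1}$; the function name is illustrative. It shows that the bound exceeds $\frac12$ exactly when $\gamma < \frac15$.

```python
from fractions import Fraction

def prediction_lower_bound(gamma):
    """(1 - gamma) / (1 + 3*gamma): lower bound on M'(alpha_n | alpha_<n)."""
    return (1 - gamma) / (1 + 3 * gamma)

bound_at_one_ninth = prediction_lower_bound(Fraction(1, 9))
bound_at_threshold = prediction_lower_bound(Fraction(1, 5))
bound_above_threshold = prediction_lower_bound(Fraction(1, 4))
```

Since the bound is decreasing in $\gamma$, any $\gamma$ below $\frac15$ keeps the prediction bounded away from $\lambda(\alpha_n|\alpha_{<n}) = \frac12$, while for larger $\gamma$ this particular estimate gives no separation.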

A converse of Theorem 6 can also be shown:

Theorem 10 (Convergence on nonrandom sequences) For every universal semimeasure $M$ there exist computable measures $\mu$ and non-$\mu$.M.L.-random sequences $\alpha$ for which $M(\alpha_n|\alpha_{<n})/\mu(\alpha_n|\alpha_{<n}) \to 1$.



5 Convergence in Martin-Löf Sense

In this section we give a positive answer to the question of predictive M.L.-convergence to $\mu$. We consider general finite alphabet $\mathcal{X}$.

Theorem 11 (Universal predictor for M.L.-random sequences) There exists an enumerable semimeasure $W$ such that for every computable measure $\mu$ and every $\mu$.M.L.-random sequence $\omega$, the predictions converge to each other:

$$W(a|\omega_{<t}) \stackrel{t\to\infty}{\longrightarrow} \mu(a|\omega_{<t}) \quad\text{for all } a \in \mathcal{X} \text{ if } d_\mu(\omega) < \infty.$$

The semimeasure $W$ we will construct is not universal in the sense of dominating all enumerable semimeasures, unlike $M$. Normalizing $W$ shows that there is also a measure whose predictions converge to $\mu$, but this measure is not enumerable, only approximable. For proving Theorem 11 we first define an intermediate measure $D$ as a mixture over all computable measures, which is not even approximable. Based on Lemmas 4, 12, and 13, Proposition 14 shows that $D$ M.L.-converges to $\mu$. We then define the concept of quasimeasures in Definition 15 and an enumerable semimeasure $W$ as a mixture over all enumerable quasimeasures. Proposition 18 shows that $W$ M.L.-converges to $D$. Theorem 11 immediately follows from Propositions 14 and 18.
Lemma 12 (Hellinger Chain) Let $h(p,q) := \sum_{i=1}^N(\sqrt{p_i} - \sqrt{q_i})^2$ be the Hellinger distance between $p = (p_i)_{i=1}^N \in \mathbb{R}_+^N$ and $q = (q_i)_{i=1}^N \in \mathbb{R}_+^N$. Then

  i) for $p,q,r \in \mathbb{R}_+^N$: $h(p,q) \le (1+\beta)\,h(p,r) + (1+\beta^{-1})\,h(r,q)$, any $\beta > 0$

  ii) for $p^1,...,p^m \in \mathbb{R}_+^N$: $h(p^1,p^m) \le 3\sum_{k=2}^m k^2\,h(p^{k-1},p^k)$



Proof. (i) For any $x,y,z \in \mathbb{R}$ and $\beta > 0$, squaring the triangle inequality $|x-y| \le |x-z| + |z-y|$ and chaining it with the binomial $2|x-z||z-y| \le \beta(x-z)^2 + \beta^{-1}(z-y)^2$ shows $(x-y)^2 \le (1+\beta)(x-z)^2 + (1+\beta^{-1})(z-y)^2$. (i) follows for $x = \sqrt{p_i}$, $y = \sqrt{q_i}$, and $z = \sqrt{r_i}$ and summation over $i$.

(ii) Applying (i) for the triples $(p^k, p^{k+1}, p^m)$ for and in order of $k = 1,2,...,m-2$ with $\beta = \beta_k$ gives

$$h(p^1,p^m) \le \sum_{k=2}^m\Big[\prod_{j=1}^{k-2}(1+\beta_j^{-1})\Big]\cdot(1+\beta_{k-1})\cdot h(p^{k-1},p^k).$$

For $\beta_k = k(k+1)$ we have $\ln\prod_{j=1}^{k-2}(1+\beta_j^{-1}) \le \sum_{j=1}^\infty\ln(1+\beta_j^{-1}) \le \sum_{j=1}^\infty\beta_j^{-1} = 1$ and $1+\beta_{k-1} \le k^2$, which completes the proof. The choice $\beta_k = 2^{K(k)}$ would lead to a bound with $1 + 2^{K(k)}$ instead of $k^2$. □
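
Both parts of Lemma 12 can be probed on random nonnegative vectors. The sketch below is a numerical sanity check with illustrative helper names and a fixed seed (assumptions made here, not part of the paper); it tests the relaxed triangle inequality (i) for several $\beta$ and the chain bound (ii) for chains of different lengths.

```python
import math
import random

def hellinger(p, q):
    """h(p, q) = sum_i (sqrt(p_i) - sqrt(q_i))**2 for nonnegative vectors."""
    return sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))

def relaxed_triangle_holds(p, q, r, beta, tol=1e-12):
    """Lemma 12(i): h(p,q) <= (1+beta) h(p,r) + (1+1/beta) h(r,q)."""
    bound = (1 + beta) * hellinger(p, r) + (1 + 1 / beta) * hellinger(r, q)
    return hellinger(p, q) <= bound + tol

def chain_holds(vectors, tol=1e-12):
    """Lemma 12(ii): h(p^1,p^m) <= 3 * sum_{k=2}^m k**2 * h(p^{k-1}, p^k)."""
    m = len(vectors)
    bound = 3 * sum(
        k ** 2 * hellinger(vectors[k - 2], vectors[k - 1]) for k in range(2, m + 1)
    )
    return hellinger(vectors[0], vectors[-1]) <= bound + tol

rng = random.Random(1)  # fixed seed so the check is reproducible

def random_vector(size=5):
    return [rng.random() for _ in range(size)]

triangle_ok = all(
    relaxed_triangle_holds(random_vector(), random_vector(), random_vector(), beta)
    for beta in (0.1, 1.0, 10.0)
    for _ in range(200)
)
chain_ok = all(
    chain_holds([random_vector() for _ in range(m)])
    for m in range(2, 12)
    for _ in range(50)
)
```

The factor $k^2$ in (ii) is the price for chaining a squared distance; the choice $\beta_k = k(k+1)$ keeps the accumulated product $\prod_j(1+\beta_j^{-1})$ below $e < 3$.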

We need a way to convert expected bounds to bounds on individual M.L. random sequences, sort of a converse of "M.L. implies w.p.1". Consider for instance the Hellinger sum $H(\omega) := \sum_{t=1}^\infty h_t(\mu,\rho)/\ln w^{-1}$ between two computable measures $\rho \ge w\cdot\mu$. Then $H$ is an enumerable function and Lemma 4 implies $E[H] \le 1$, hence $H$ is an integral $\mu$-test. $H$ can be increased to an enumerable $\mu$-supermartingale $\bar H$. The universal $\mu$-supermartingale $M/\mu$ multiplicatively dominates all enumerable supermartingales (and hence $\bar H$). Since $M/\mu \le 2^{d_\mu(\omega)}$, this implies the desired bound $H(\omega) \stackrel{\times}{\le} 2^{d_\mu(\omega)}$ for individual $\omega$. We give a self-contained direct proof, explicating all important constants.

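Lemma 13 (Expected to Individual Bound) Let $F(\omega) \ge 0$ be an enumerable function and $\mu$ be an enumerable measure and $\varepsilon > 0$ be co-enumerable. Then:

$$\text{If}\quad E_\mu[F] \le \varepsilon \quad\text{then}\quad F(\omega) \stackrel{\times}{\le} \varepsilon\cdot 2^{K(\mu,F,1/\varepsilon) + d_\mu(\omega)}\ \ \forall\omega$$

where $d_\mu(\omega)$ is the $\mu$-randomness deficiency of $\omega$ and $K(\mu,F,1/\varepsilon)$ is the length of the shortest program for $\mu$, $F$, and $1/\varepsilon$.

Lemma 13 roughly says that for $\mu$, $F$, and $\varepsilon \stackrel{\times}{=} E_\mu[F]$ with short program ($K(\mu,F,1/\varepsilon) = O(1)$) and $\mu$-random $\omega$ ($d_\mu(\omega) = O(1)$) we have $F(\omega) \stackrel{\times}{\le} E_\mu[F]$.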

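Proof. Let $F(\omega) = \lim_{n\to\infty}F_n(\omega) = \sup_n F_n(\omega)$ be enumerated by an increasing sequence of computable functions $F_n(\omega)$. $F_n(\omega)$ can be chosen to depend on $\omega_{1:n}$ only, i.e. $F_n(\omega) = F_n(\omega_{1:n})$ is independent of $\omega_{n+1:\infty}$. Let $\varepsilon_n \searrow \varepsilon$ co-enumerate $\varepsilon$. We define

$$\bar\mu_n(\omega_{1:k}) := \varepsilon_n^{-1}\sum_{\omega_{k+1:n}}\mu(\omega_{1:n})F_n(\omega_{1:n}) \text{ for } k \le n, \quad\text{and}\quad \bar\mu_n(\omega_{1:k}) = 0 \text{ for } k > n.$$

$\bar\mu_n$ is a computable semimeasure for each $n$ (due to $E_\mu[F_n] \le \varepsilon$) and increasing in $n$, since

$$\bar\mu_n(\omega_{1:k}) \ge 0 = \bar\mu_{n-1}(\omega_{1:k}) \text{ for } k \ge n \quad\text{and}$$
$$\bar\mu_n(\omega_{<n}) \ge \varepsilon_n^{-1}\sum_{\omega_n}\mu(\omega_{1:n})F_{n-1}(\omega_{<n}) = \varepsilon_n^{-1}\mu(\omega_{<n})F_{n-1}(\omega_{<n}) \ge \bar\mu_{n-1}(\omega_{<n})$$

(using $F_n \ge F_{n-1}$, that $\mu$ is a measure, and $\varepsilon_n \le \varepsilon_{n-1}$), and similarly for $k < n-1$. Hence $\bar\mu := \bar\mu_\infty$ is an enumerable semimeasure (indeed $\bar\mu$ is proportional to a measure). From dominance (2) we get

$$M(\omega_{1:n}) \stackrel{\times}{\ge} 2^{-K(\bar\mu)}\bar\mu(\omega_{1:n}) \ge 2^{-K(\bar\mu)}\bar\mu_n(\omega_{1:n}) = 2^{-K(\bar\mu)}\varepsilon_n^{-1}\mu(\omega_{1:n})F_n(\omega_{1:n}). \qquad (7)$$

In order to enumerate $\bar\mu$, we need to enumerate $\mu$, $F$, and $\varepsilon^{-1}$, hence $K(\bar\mu) \stackrel{+}{\le} K(\mu,F,1/\varepsilon)$, so we get

$$F_n(\omega) \equiv F_n(\omega_{1:n}) \stackrel{\times}{\le} \varepsilon_n\cdot 2^{K(\mu,F,1/\varepsilon)}\cdot\frac{M(\omega_{1:n})}{\mu(\omega_{1:n})} \le \varepsilon_n\cdot 2^{K(\mu,F,1/\varepsilon)+d_\mu(\omega)}.$$

Taking the limit $F_n \nearrow F$ and $\varepsilon_n \searrow \varepsilon$ completes the proof. □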

Let $\mathcal{M} = \{\nu_1,\nu_2,...\}$ be an enumeration of all enumerable semimeasures, $J_k := \{i \le k : \nu_i \text{ is measure}\}$, and $\delta_k(x) := \sum_{i\in J_k}\varepsilon_i\nu_i(x)$. The weights $\varepsilon_i$ need to be computable and exponentially decreasing in $i$ and $\sum_{i=1}^\infty\varepsilon_i \le 1$. We choose $\varepsilon_i = i^{-6}2^{-i}$. Note the subtle and important fact that although the definition of $J_k$ is non-constructive, as a finite set of finite objects, $J_k$ is decidable (the program is unknowable for large $k$). Hence, $\delta_k$ is computable, since enumerable measures are computable.

$$D(x) = \delta_\infty(x) = \sum_{i\in J_\infty}\varepsilon_i\nu_i(x) = \text{mixture of all computable measures}.$$

In contrast to $J_k$ and $\delta_k$, the set $J_\infty$ and hence $D$ are neither enumerable nor co-enumerable. We also define the measures $\hat\delta_k(x) := \delta_k(x)/\delta_k(\epsilon)$ and $\hat D(x) := D(x)/D(\epsilon)$. The following Proposition implies predictive convergence of $D$ to $\mu$ on $\mu$-random sequences.
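
The summability requirement on the weights can be verified with exact arithmetic. The sketch below assumes the choice $\varepsilon_i = i^{-6}2^{-i}$; the cut-off at 60 terms and the geometric tail estimate are illustrative choices made here.

```python
from fractions import Fraction

def eps(i):
    """Weights eps_i = i**-6 * 2**-i as an exact rational."""
    return Fraction(1, i ** 6 * 2 ** i)

partial_sum = sum(eps(i) for i in range(1, 60))
# sum_{i>=60} eps_i <= sum_{i>=60} 2**-i = 2**-59
tail_bound = Fraction(1, 2 ** 59)
total_upper_bound = partial_sum + tail_bound
```

The sum is dominated by $\varepsilon_1 = \frac12$, so any normalization $\delta_k(\epsilon)$ or $D(\epsilon)$ built from these weights stays below 1.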

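Proposition 14 (Convergence of incomputable measure $\hat D$) Let $\mu$ be a computable measure with index $k_0$, i.e. $\mu = \nu_{k_0}$. Then for the incomputable measure $\hat D$ and the computable but non-constructive measures $\hat\delta_{k_0}$ defined above, the following holds:

$$(i)\quad \sum_{t=1}^\infty h_t(\hat\delta_{k_0},\mu) \stackrel{+}{\le} 2\ln 2\cdot d_\mu(\omega) + 3k_0$$
$$(ii)\quad \sum_{t=1}^\infty h_t(\hat\delta_{k_0},\hat D) \stackrel{\times}{\le} 3k_0^7\,2^{k_0+d_\mu(\omega)}$$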
Combining (i) and (ii), using Lemma 12(i), we get $\sum_{t=1}^\infty h_t(\mu,\hat D) \le c_\omega f(k_0) < \infty$ for $\mu$-random $\omega$, which implies $D(b|\omega_{<t}) \equiv \hat D(b|\omega_{<t}) \to \mu(b|\omega_{<t})$. We do not know whether on-sequence convergence of the ratio holds. Similar bounds hold for $\hat\delta_{k_1}$ instead of $\hat\delta_{k_0}$, $k_1 \ge k_0$. The principal proof idea is to convert the expected bounds of Lemma 4 to individual bounds, using Lemma 13. The problem is that $\hat D$ is not computable, which we circumvent by joining, with Lemma 12, bounds on $\sum_t h_t(\hat\delta_{k-1},\hat\delta_k)$ for $k = k_0,k_0+1,...$.

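Proof. (i) Let $H(\omega) := \sum_{t=1}^\infty h_t(\hat\delta_{k_0},\mu)$. $\mu$ and $\hat\delta_{k_0}$ are measures with
$\hat\delta_{k_0} \ge \delta_{k_0} \ge \varepsilon_{k_0}\mu$, since $\delta_k(\epsilon) \le 1$, $\mu = \nu_{k_0}$ and $k_0 \in J_{k_0}$.
Hence, Lemma S] applies and shows $E_\mu[\exp(\frac12 H)] \le \varepsilon_{k_0}^{-1/2}$. $H$ is well-defined and
enumerable for $d_\mu(\omega) < \infty$, since $d_\mu(\omega) < \infty$ implies $\mu(\omega_{1:t}) \ne 0$ implies
$\hat\delta_{k_0}(\omega_{1:t}) \ne 0$. So $\mu(b|\omega_{1:t})$ and $\hat\delta_{k_0}(b|\omega_{1:t})$ are well
defined and computable (given $J_{k_0}$). Hence $h_t(\hat\delta_{k_0},\mu)$ is computable, hence $H(\omega)$
is enumerable. Lemma 13 then implies
$\exp(\frac12 H(\omega)) \stackrel{\times}{\le} \varepsilon_{k_0}^{-1/2}\cdot 2^{K(\mu,H,\sqrt{\varepsilon_{k_0}})+d_\mu(\omega)}$. We
bound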

$$K(\mu,H,\sqrt{\varepsilon_{k_0}}) \;\stackrel{+}{\le}\; K(H|\mu,k_0)+K(k_0) \;\stackrel{+}{\le}\; K(J_{k_0}|k_0)+K(k_0) \;\stackrel{+}{\le}\; k_0+2\log k_0.$$

The first inequality holds, since $k_0$ is the index and hence a description of $\mu$, and
$\varepsilon_{(\cdot)}$ is a simple computable function. $H$ can be computed from $\mu$, $k_0$ and $J_{k_0}$, which
implies the second inequality. The last inequality follows from $K(k_0) \stackrel{+}{\le} 2\log k_0$ and
the fact that for each $i \le k_0$ one bit suffices to specify (non)membership to $J_{k_0}$, i.e.
$K(J_{k_0}|k_0) \stackrel{+}{\le} k_0$. Putting everything together we get

$$H(\omega) \;\stackrel{+}{\le}\; \ln\varepsilon_{k_0}^{-1} + [k_0+2\log k_0+d_\mu(\omega)]\,2\ln 2 \;\stackrel{+}{\le}\; (2\ln 2)\,d_\mu(\omega)+3k_0.$$

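(ii) Let $H^k(\omega) := \sum_{t=1}^\infty h_t(\hat\delta_k,\hat\delta_{k-1})$ and $k > k_0$. $\delta_{k-1} \le \delta_k$ implies

$$\frac{\hat\delta_{k-1}(x)}{\hat\delta_k(x)} \le \frac{\delta_k(\epsilon)}{\delta_{k-1}(\epsilon)} \le \frac{\delta_{k-1}(\epsilon)+\varepsilon_k}{\delta_{k-1}(\epsilon)} = 1+\frac{\varepsilon_k}{\delta_{k-1}(\epsilon)} \le 1+\frac{\varepsilon_k}{\varepsilon_o},$$

where $o := \min\{i \in J_{k-1}\} = O(1)$. Note that $J_{k-1} \ni k_0$ is not empty. Since $\hat\delta_{k-1}$ and
$\hat\delta_k$ are measures, Lemma H] applies and shows
$E_{\hat\delta_{k-1}}[H^k] \le \ln(1+\frac{\varepsilon_k}{\varepsilon_o}) \le \frac{\varepsilon_k}{\varepsilon_o}$. Exploiting
$\varepsilon_{k_0}\mu \le \hat\delta_{k-1}$, this implies $E_\mu[H^k] \le \frac{\varepsilon_k}{\varepsilon_o\varepsilon_{k_0}}$. Lemma 13 then implies

$$H^k(\omega) \;\stackrel{\times}{\le}\; \frac{\varepsilon_k}{\varepsilon_o\varepsilon_{k_0}}\cdot 2^{K(\mu,H^k,\varepsilon_o\varepsilon_{k_0}/\varepsilon_k)+d_\mu(\omega)}.$$

Similarly as in (i) we can bound

$$K(\mu,H^k,\varepsilon_o\varepsilon_{k_0}/\varepsilon_k) \;\stackrel{+}{\le}\; K(J_k|k)+K(k)+K(k_0) \;\stackrel{+}{\le}\; k+2\log k+2\log k_0, \text{ hence}$$
$$H^k(\omega) \;\stackrel{\times}{\le}\; \frac{k_0^6 2^{k_0}}{k^6 2^k}\cdot 2^k k^2 k_0^2\, c_\omega = k_0^8\, 2^{k_0} k^{-4} c_\omega, \quad\text{where } c_\omega := 2^{d_\mu(\omega)}.$$

Chaining this bound via Lemma 12 we get for $k_1 > k_0$: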

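$$\sum_{t=1}^n h_t(\hat\delta_{k_0},\hat\delta_{k_1}) \;\le\; \sum_{t=1}^n 3\sum_{k=k_0+1}^{k_1}(k-k_0+1)^2 h_t(\hat\delta_{k-1},\hat\delta_k)$$
$$\le\; 3\sum_{k=k_0+1}^{k_1} k^2 H^k(\omega) \;\stackrel{\times}{\le}\; 3k_0^8\, 2^{k_0} c_\omega \sum_{k=k_0+1}^{k_1} k^{-2} \;\le\; 3k_0^7\, 2^{k_0} c_\omega.$$

If we now take $k_1 \to \infty$ we get $\sum_{t=1}^n h_t(\hat\delta_{k_0},\hat D) \stackrel{\times}{\le} 3k_0^7\, 2^{k_0+d_\mu(\omega)}$. Finally let $n \to \infty$. □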

The main properties allowing for proving $\hat D \to \mu$ were that $\hat D$ is a measure with
approximations $\hat\delta_k$, which are computable in a certain sense. $\hat D$ is a mixture over all
enumerable/computable measures and hence incomputable.

6 M.L.-Converging Enumerable Semimeasure W

The next step is to enlarge the class of computable measures to an enumerable class
of semimeasures, which are still sufficiently close to measures in order not to spoil
the convergence result. For convergence w.p.1 we could include all semimeasures
(Theorem 3). M.L.-convergence seems to require a more restricted class. Included
non-measures need to be zero on long strings. We define quasimeasures as nearly
normalized measures on $\mathcal{X}^{\le n}$.

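Definition 15 (Quasimeasures) $\tilde\nu : \mathcal{X}^* \to \mathbb{R}^+$ is called a quasimeasure iff $\tilde\nu$ is a
measure or: $\sum_{a\in\mathcal{X}}\tilde\nu(xa) = \tilde\nu(x)$ for $\ell(x) < n$ and $\tilde\nu(x) = 0$ for $\ell(x) > n$ and
$1-\frac1n < \tilde\nu(\epsilon) \le 1$, for some $n \in \mathbb{N}$.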

Lemma 16 (Quasimeasures) (i) A quasimeasure is either a semimeasure which
is zero on long strings -or- a measure. (ii) The set of enumerable quasimeasures is
enumerable and contains all computable measures.

For enumerability it is important to include the measures in the definition of
quasimeasures. One way of enumeration would be to enumerate all enumerable partial
functions $f$ and convert them to quasimeasures. Since we need a correspondence
to semimeasures, we convert a semimeasure $\nu$ directly to a maximal quasimeasure
$\tilde\nu \le \nu$.

Proof & construction. (i) Obvious from Definition 15.

(ii) Let $\nu$ be an enumerable semimeasure enumerated by $\nu^t \nearrow \nu$. Consider
$m \equiv m^t := \max\{n \le t : \sum_{x_{1:n}}\nu^t(x_{1:n}) > 1-\frac1n\}$. $m^t$ is finite and monotone increasing in $t$.

We define the quasimeasure

$$\rho^t(x_{1:n}) := \sum_{x_{n+1:m}}\nu^t(x_{1:m}) \text{ for } n \le m \quad\text{and}\quad \rho^t(x_{1:n}) = 0 \text{ for } n > m.$$

We define an increasing sequence in $t$ of quasimeasures $\tilde\nu^t \le \nu^t$ for $t = 1,2,\ldots$ recursively
starting with $\tilde\nu^0 := 0$ as follows:

If $\rho^t(x_{1:n}) \ge \tilde\nu^{t-1}(x_{1:n})$ $\forall x_{1:n}\,\forall n \le m^t$ (and hence $\forall x$), then $\tilde\nu^t := \rho^t$, else $\tilde\nu^t := \tilde\nu^{t-1}$.

$\tilde\nu := \lim_{t\to\infty}\tilde\nu^t$ is an enumerable quasimeasure. Note that $m^\infty = \infty$ iff $\nu$ is a measure.
One can easily verify that $\tilde\nu \le \nu$ and $\tilde\nu = \nu$ iff $\nu$ is a quasimeasure. This implies
that if $\nu_1,\nu_2,\ldots$ is an enumeration of all enumerable semimeasures, then $\tilde\nu_1,\tilde\nu_2,\ldots$ is
an enumeration of all enumerable quasimeasures. □
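
A single stage of this conversion can be sketched in Python. The sketch rests on several assumptions that are not taken from the text: it works with one exactly known toy semimeasure `nu` instead of the lower approximations $\nu^t$, it reads the cutoff as $m = \max\{n \le t : \sum_{x_{1:n}}\nu(x_{1:n}) > 1 - 1/n\}$ with marginals defined for $n \le m$, and it uses the hypothetical convention $m = 0$ when no level qualifies. The monotonicity test between consecutive stages is omitted.

```python
from fractions import Fraction
from itertools import product

ALPHABET = "01"


def strings(n):
    """All strings of length n over ALPHABET (the empty string for n = 0)."""
    return ["".join(s) for s in product(ALPHABET, repeat=n)]


def nu(x):
    """Toy semimeasure (hypothetical): each step loses a factor 1 - 1/(j+1)^2 of the mass."""
    p = Fraction(1)
    for j in range(1, len(x) + 1):
        p *= Fraction(1, 2) * (1 - Fraction(1, (j + 1) ** 2))
    return p


def level_mass(sm, n):
    """Total mass a semimeasure assigns to the strings of length n."""
    return sum(sm(x) for x in strings(n))


def cutoff(sm, t):
    """Largest n <= t whose level mass exceeds 1 - 1/n; 0 if there is none (assumed convention)."""
    good = [n for n in range(1, t + 1) if level_mass(sm, n) > 1 - Fraction(1, n)]
    return max(good, default=0)


def project(sm, t):
    """Return (m, rho): rho holds the marginals of the length-m values for l(x) <= m.

    Strings longer than m are absent from the dict, i.e. rho is zero there.
    """
    m = cutoff(sm, t)
    rho = {}
    for n in range(m + 1):
        for x in strings(n):
            rho[x] = sum(sm(x + y) for y in strings(m - n))
    return m, rho


m, rho = project(nu, 5)
print(m, rho[""], rho["0"])
```

By construction `rho` is additive below the cutoff, vanishes above it, and never exceeds `nu`, which mirrors the property $\tilde\nu \le \nu$ used in the proof.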
Let $\tilde\nu_1,\tilde\nu_2,\ldots$ be the enumeration of all enumerable quasimeasures constructed in
the proof of Lemma 16, based on the enumeration of all enumerable semimeasures
$\nu_1,\nu_2,\ldots$ with the property that $\tilde\nu_i \le \nu_i$ and equality holds if $\nu_i$ is a (quasi)measure.
We define the enumerable semimeasure

$$W(x) := \sum_{i=1}^\infty \varepsilon_i\tilde\nu_i(x), \quad\text{and note that}\quad D(x) = \sum_{i\in J}\varepsilon_i\tilde\nu_i(x) \quad\text{with}\quad J := \{i : \tilde\nu_i \text{ is measure}\}$$

with $\varepsilon_i = i^{-6}2^{-i}$ as before. To show $W \to D$ we need the following Lemma.



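Lemma 17 (Hellinger Continuity) For $h_x(\mu,\rho) := \sum_{a\in\mathcal{X}}\big(\sqrt{\mu(a|x)}-\sqrt{\rho(a|x)}\big)^2$,
where $\rho(y) = \mu(y)+\nu(y)$ $\forall y \in \mathcal{X}^*$ and $\mu$ and $\nu$ are semimeasures, it holds:

(i) $h_x(\mu,\rho) \le \varepsilon$ if $\nu(x) \le \varepsilon\cdot\mu(x)$,

(ii) $h_x(\mu,\rho) \le \varepsilon^2$ if $\nu(x) \le \varepsilon\cdot\mu(x)$ and $\nu(xb) \le \varepsilon\cdot\mu(xb)$ $\forall b \in \mathcal{X}$.

(ii) Since the Hellinger distance is locally quadratic, $h_x(\mu,\rho)$ scales quadratically in
the deviation of predictor $\rho$ from $\mu$. (i) Closeness of $\rho(x)$ to $\mu(x)$ only does not
imply closeness of the predictions, hence only a bound linear in the deviation is
possible.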

Proof. (i) We identify $\mathcal{X} = \{1,\ldots,N\}$ and define $y_i = \mu(xi)$, $z_i = \nu(xi)$, $y = \mu(x)$,
and $z = \nu(x)$. We extend $(y_i)_{i=1}^N$ to a probability by defining $y_0 = y-\sum_{i=1}^N y_i \ge 0$ and
set $z_0 = 0$. Also $\varepsilon' := z/y$. Exploiting $\sum_{i=0}^N y_i = y$ and $\sum_{i=0}^N z_i \le z$ and $z \le \varepsilon y$ and
$y_i, z_i, y, z \ge 0$ we get

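$$h_x(\mu,\mu+\nu) \equiv \sum_{i=1}^N\left(\sqrt{\frac{y_i}{y}}-\sqrt{\frac{y_i+z_i}{y+z}}\right)^2 \le \sum_{i=0}^N\left(\sqrt{\frac{y_i}{y}}-\sqrt{\frac{y_i+z_i}{y+z}}\right)^2$$
$$= \sum_{i=0}^N\left(\frac{y_i}{y}+\frac{y_i+z_i}{y+z}-2\sqrt{\frac{y_i(y_i+z_i)}{y(y+z)}}\right) \le 2-2\sum_{i=0}^N\frac{y_i}{\sqrt{y(y+z)}} = 2-\frac{2}{\sqrt{1+\varepsilon'}} \le \varepsilon'.$$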

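(ii) With the notation from (i), additionally exploiting $z_i \le \varepsilon y_i$ we get

$$\sqrt{\frac{y_i+z_i}{y+z}}-\sqrt{\frac{y_i}{y}} \le \sqrt{\frac{y_i(1+\varepsilon)}{y}}-\sqrt{\frac{y_i}{y}} \le \frac{\varepsilon}{2}\sqrt{\frac{y_i}{y}},$$
$$\sqrt{\frac{y_i}{y}}-\sqrt{\frac{y_i+z_i}{y+z}} = \frac{\sqrt{y_i(1+\varepsilon')}-\sqrt{y_i+z_i}}{\sqrt{y(1+\varepsilon')}} \le \frac{\sqrt{y_i(1+\varepsilon')}-\sqrt{y_i}}{\sqrt{y(1+\varepsilon')}} \le \frac{\varepsilon'}{2}\sqrt{\frac{y_i}{y}}.$$

Exploiting $\varepsilon' \le \varepsilon$, taking the square and summing over $i$ proves (ii). □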
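
The two regimes of the lemma (a bound linear in the deviation when only $\nu(x)$ is small, a quadratic one when all $\nu(xb)$ are small as well) can be checked numerically. The Python sketch below is an illustration only; the numbers are hypothetical, and the quadratic bound is tested in the form $h_x \le \varepsilon^2$, which the estimates in the proof imply.

```python
import math


def hellinger(p, q):
    """Sum over symbols of (sqrt(p_a) - sqrt(q_a))^2 for two (semi)probability vectors."""
    return sum((math.sqrt(pa) - math.sqrt(qa)) ** 2 for pa, qa in zip(p, q))


def h_x(y, ys, z, zs):
    """h_x(mu, mu + nu) with y = mu(x), ys[i] = mu(x i), z = nu(x), zs[i] = nu(x i)."""
    mu_cond = [yi / y for yi in ys]
    rho_cond = [(yi + zi) / (y + z) for yi, zi in zip(ys, zs)]
    return hellinger(mu_cond, rho_cond)


# Hypothetical values for one context x over a binary alphabet.
y, ys = 0.5, [0.3, 0.2]      # mu(x) and mu(x0), mu(x1)
z, zs = 0.05, [0.05, 0.0]    # nu(x) and nu(x0), nu(x1)

eps_linear = z / y                                                  # smallest eps with nu(x) <= eps * mu(x)
eps_quadratic = max(eps_linear, *(zi / yi for yi, zi in zip(ys, zs)))  # also nu(xb) <= eps * mu(xb)

h = h_x(y, ys, z, zs)
print(h, eps_linear, eps_quadratic ** 2)
```

In this example the distance lies far below both bounds, which is typical when $\nu$ perturbs the conditionals of $\mu$ only slightly.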

Proposition 18 (Convergence of enumerable W to incomputable D) For
every computable measure $\mu$ and for $\omega$ being $\mu$-random, the following holds for
$t \to \infty$:

(i) $\frac{W(\omega_{1:t})}{D(\omega_{1:t})} \to 1$, (ii) $\frac{W(\omega_t|\omega_{<t})}{D(\omega_t|\omega_{<t})} \to 1$, (iii) $W(a|\omega_{<t}) \to D(a|\omega_{<t})$ $\forall a \in \mathcal{X}$.

The intuitive reason for the convergence is that the additional contributions of
non-measures to $W$ absent in $D$ are zero for long sequences.

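Proof. (i)

$$D(x) \le W(x) = D(x)+\sum_{i\notin J}\varepsilon_i\tilde\nu_i(x) \le D(x)+\sum_{i=k_x}^\infty \varepsilon_i\tilde\nu_i(x), \qquad (8)$$

where $k_x := \min_i\{i \notin J : \tilde\nu_i(x) \ne 0\}$. For $i \notin J$, $\tilde\nu_i$ is not a measure. Hence $\tilde\nu_i(x) = 0$ for
sufficiently long $x$. This implies $k_x \to \infty$ for $\ell(x) \to \infty$, hence $W(x) \to D(x)$ $\forall x$. To
get convergence in ratio we have to assume that $x = \omega_{1:n}$ with $\omega$ being $\mu$-random,
i.e. $c_\omega := \sup_n \frac{M(\omega_{1:n})}{\mu(\omega_{1:n})} = 2^{d_\mu(\omega)} < \infty$.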

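$$\tilde\nu_i(x) \le \nu_i(x) \le \frac{1}{w_{\nu_i}}M(x) \le \frac{c_\omega}{w_{\nu_i}}\mu(x) \le \frac{c_\omega}{w_{\nu_i}\varepsilon_{k_0}}D(x).$$

The last inequality holds, since $\mu$ is a computable measure of index $k_0$, i.e. $\mu = \nu_{k_0} =
\tilde\nu_{k_0}$. Inserting $1/w_{\nu_i} \le c\cdot i^2$ for some $c = O(1)$ and $\varepsilon_i$ we get
$\varepsilon_i\tilde\nu_i(x) \le \frac{c\,c_\omega}{\varepsilon_{k_0}}i^{-4}2^{-i}D(x)$,
which implies $\sum_{i=k_x}^\infty \varepsilon_i\tilde\nu_i(x) \le \varepsilon'_x D(x)$ with

$$\varepsilon'_x := \frac{c\,c_\omega}{\varepsilon_{k_0}}\sum_{i=k_x}^\infty i^{-4}2^{-i} \le \frac{2c\,c_\omega}{\varepsilon_{k_0}}k_x^{-4}2^{-k_x} \to 0 \quad\text{for}\quad \ell(x) \to \infty.$$

Inserting this into (8) we get

$$1 \le \frac{W(x)}{D(x)} \le 1+\varepsilon'_x \to 1 \quad\text{for } \mu\text{-random } x.$$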

(ii) Obvious from (i) by taking a double ratio. 

(iii) Since $D$ and $W-D$ are semimeasures and $\frac{W(x)-D(x)}{D(x)} \le \varepsilon'_x$ by (i), Lemma 17(i)
implies $h_x(D,W) \le \varepsilon'_x$. Since $\varepsilon'_x \to 0$ for $\mu$-random $x$, this shows (iii).
$|W(a|x)-D(a|x)| \le \varepsilon'_x$ can also be shown.

Speed of convergence. The main convergence Theorem [11] now immediately
follows from Propositions 14 and 18. We briefly remark on the convergence rate.
For $M$, Lemma H] shows that $E[\sum_t h_t(M,\mu)] \le \ln w_{\nu_{k_0}}^{-1} \stackrel{\times}{=} \ln k_0$ is logarithmic in the index
$k_0$ of $\mu$, but $E[\sum_t h_t(X,\mu)] \le \ln\varepsilon_{k_0}^{-1} \stackrel{\times}{=} k_0$ is linear in $k_0$ for $X = [W,D,\delta_{k_0}]$. The individual
bounds for $\sum_t h_t(\hat\delta_{k_0},\mu)$ and $\sum_t h_t(\hat\delta_{k_0},\hat D)$ in Proposition 14 are linear and exponential
in $k_0$, respectively. For $W \to D$ we could not establish any convergence speed.

Finally we show that $W$ does not dominate all enumerable semimeasures, as the
definition of $W$ suggests. We summarize all computability, measure, and dominance
properties of $M$, $D$, $\hat D$, and $W$ in the following theorem:

Theorem 19 (Properties of $M$, $W$, $D$, and $\hat D$)

(i) $M$ is an enumerable semimeasure, which dominates all enumerable semimeasures.
$M$ is not computable and not a measure.

(ii) $\hat D$ is a measure, $D$ is proportional to a measure, both dominating all enumerable
quasimeasures. $D$ and $\hat D$ are not computable and do not dominate all enumerable
semimeasures.

(iii) $W$ is an enumerable semimeasure, which dominates all enumerable quasimeasures.
$W$ is not itself a quasimeasure, is not computable, and does not dominate all
enumerable semimeasures.

We conjecture that $D$ and $\hat D$ are not even approximable (limit-computable), but
lie somewhere higher in the arithmetic hierarchy. Since $W$ can be normalized to
an approximable measure M.L.-converging to $\mu$, and $D$ was only an intermediate
quantity, the question of approximability of $D$ seems not too interesting.

Proof. (i) First sentence: Holds by definition. That such an $M$ exists follows from
the enumerability of all enumerable semimeasures [ZL70, LV97]. Second sentence:
If $M$ were a measure it would be computable, contradicting [Hut03b, Thm.4(iii)]
(see below).

(ii) First sentence: Follows from the definition of $D$ and $\hat D$ and the fact that
quasimeasures are zero on long strings: $\hat D \ge D \ge \varepsilon_i\nu$ if $\nu = \nu_i$ is a computable measure. If $\tilde\nu$
is a "proper" quasimeasure, then $\min_x\frac{D(x)}{\tilde\nu(x)} = \min_{x:\ell(x)\le m^\infty}\frac{D(x)}{\tilde\nu(x)} > 0$, since $\tilde\nu(x) = 0$
for $\ell(x) > m^\infty < \infty$, and $D(x) > 0$ $\forall x$. Second sentence: It is well known that there is
no computable semimeasure dominating all computable measures (see e.g. [Hut03b,
Thm.4]), which shows that $D$, $\hat D$ and $W$ cannot be computable. We now show that $D$
and $W$ do not dominate the enumerable semimeasure $M$ by extending this argument.
Let $\nu$ be a nowhere$^1$ zero computable semimeasure. We define a computable sequence
$\alpha$ as follows by induction: Given $\alpha_{<n}$, choose some $\alpha_n$ in a computable way (by
computing $\nu$ to sufficient accuracy) such that $\nu(\alpha_n|\alpha_{<n}) \le |\mathcal{X}|^{-1}(1+\frac{1}{n^2})$. Such an
$\alpha_n$ exists, since $\nu$ is a semimeasure. We then define the computable deterministic
measure $\bar\nu$ concentrated on $\alpha$, i.e. $\bar\nu(\alpha_{1:n}) = 1$ $\forall n$ and $\bar\nu(x) = 0$ for all $x$ which are not
prefixes of $\alpha$. By the chain rule we get
$\nu(\alpha_{1:n}) \le |\mathcal{X}|^{-n}\prod_{t=1}^n(1+\frac{1}{t^2}) \le 4|\mathcal{X}|^{-n}\bar\nu(\alpha_{1:n})$. This
shows that no computable semimeasure $\nu$ can dominate all computable measures,
since $\bar\nu$ is not dominated. We use this construction for $\nu = \delta_k$:

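$$\delta_k(\alpha_{1:n}) \le 4|\mathcal{X}|^{-n}\bar\nu(\alpha_{1:n}) \;\stackrel{\times}{\le}\; |\mathcal{X}|^{-n}2^{K(\bar\nu)}M(\alpha_{1:n}) \;\stackrel{\times}{\le}\; |\mathcal{X}|^{-n}k^2 2^k M(\alpha_{1:n}) \le k^2 2^{-k}M(\alpha_{1:n}) \qquad (9)$$

for sufficiently large $n \ge n_k := \frac{2k}{\log|\mathcal{X}|}$, where we used $M \stackrel{\times}{\ge} 2^{-K(\bar\nu)}\bar\nu$ and
$K(\bar\nu) \stackrel{+}{\le} K(\delta_k) \stackrel{+}{\le} k+2\log k$.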

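For all $x$ we have

$$D(x)-\delta_k(x) \le \sum_{i=k+1}^\infty \varepsilon_i\nu_i(x) = \sum_{i=k+1}^\infty i^{-6}2^{-i}\nu_i(x) \le 2^{-k}\sum_{i=k+1}^\infty i^{-6}\nu_i(x) \;\stackrel{\times}{\le}\; 2^{-k}M(x).$$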



$^1$ $M$, $W$, $D$, $\hat D$, and $\delta_k$ for $k \ge O(1)$ are nowhere zero. Alternatively one can verify that all
relevant assertions remain valid if $\nu$ is somewhere zero.

Summing both bounds we get $D(\alpha_{1:n_k}) \le W(\alpha_{1:n_k}) \stackrel{\times}{\le} (k^2+1)2^{-k}M(\alpha_{1:n_k})$, which
shows that $D$, $\hat D$ and $W$ do not dominate the enumerable semimeasure $M$.
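
The diagonal construction behind this argument can be illustrated with a short Python sketch. It is a toy instance under stated assumptions: Laplace's rule of succession serves as a hypothetical stand-in for the nowhere zero computable semimeasure $\nu$, and because it is evaluated exactly with rationals, the symbol of smallest conditional probability is chosen directly, so the slack factor $(1+1/n^2)$ needed for approximate computation does not appear.

```python
from fractions import Fraction

ALPHABET = "01"


def laplace(x):
    """Toy nowhere-zero computable measure: nu(b | x) = (count_b(x) + 1) / (len(x) + |X|)."""
    p = Fraction(1)
    counts = {b: 0 for b in ALPHABET}
    for n, b in enumerate(x):
        p *= Fraction(counts[b] + 1, n + len(ALPHABET))
        counts[b] += 1
    return p


def diagonal_sequence(nu, n):
    """Build alpha_{1:n} by always appending a symbol of minimal conditional probability."""
    a = ""
    for _ in range(n):
        a += min(ALPHABET, key=lambda b: nu(a + b) / nu(a))
    return a


def point_measure(a):
    """Deterministic measure concentrated on a, restricted to strings no longer than a."""
    return lambda x: Fraction(int(a.startswith(x)))


n = 12
alpha = diagonal_sequence(laplace, n)
nu_bar = point_measure(alpha)

# nu(alpha_{1:n}) <= |X|^{-n} * nu_bar(alpha_{1:n}): no constant c gives nu >= c * nu_bar.
print(alpha, laplace(alpha), Fraction(1, len(ALPHABET) ** n) * nu_bar(alpha))
```

Since the conditionals of a semimeasure sum to at most 1, the smallest one is at most $1/|\mathcal{X}|$, so the probability of the constructed prefix shrinks at least like $|\mathcal{X}|^{-n}$ while the deterministic measure keeps assigning it probability 1.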

Remark: Note that the constructed sequence(s) $\alpha$ depends on the choice of $k$,
so we should write more precisely $\alpha = \alpha^k$. For $D$ (but not for $W$) we can choose
$k = \frac12 n\log|\mathcal{X}|$ in (9) (satisfying $n \ge n_k$), leading to
$D(\alpha^k_{1:n}) \stackrel{\times}{\le} n^2|\mathcal{X}|^{-n/2}M(\alpha^k_{1:n})$. It
is easy to generalize to $\forall x_{<t}\,\exists\alpha_{t:n} : \delta_k(x_{<t}\alpha_{t:n}) \stackrel{\times}{\le} |\mathcal{X}|^{t-n}k^2 2^k M(x_{<t}\alpha_{t:n})$, where $t$ is
a simple function of $k$. Choosing $t = k^2+1$ and $n = (k+1)^2$ and joining the results for
$k = 1,2,\ldots$ and $x_{<t} := \alpha_{<t}$ we get $D(\alpha_{1:n}) \stackrel{\times}{\le} n2^{-\sqrt{n}}M(\alpha_{1:n})$ $\forall n$ for the single sequence $\alpha$.
This implies that (but is stronger than) $\alpha$ is not random w.r.t. any computable
measure $\rho$. Such $\alpha$ are sometimes called absolutely non-stochastic.

(iii) First sentence: Enumerability is immediate from the definition, given the
enumerability of all enumerable quasimeasures. Second sentence: Since quasimeasures
drop out in the mixture defining $W$ for long $x$, $W$ cannot be a measure. Since
$W(x) \ne 0$ $\forall x$ it is also not a quasimeasure. Non-computability and non-dominance
of $W$ have already been shown in (ii). □



7 Conclusions 

We investigated a natural strengthening of Solomonoff's famous convergence theorem,
the latter stating that with probability 1 (w.p.1) the prediction of a universal
semimeasure $M$ converges to the true computable distribution $\mu$ ($M \stackrel{w.p.1}{\longrightarrow} \mu$). We gave
a partially negative answer to the question of whether convergence also holds individually
for all Martin-Löf (M.L.) random sequences ($\exists M : M \stackrel{M.L.}{\not\longrightarrow} \mu$). We constructed random
sequences $\alpha$ for which there exist universal semimeasures on which convergence
fails. Multiplicative dominance of $M$ is the key property to show convergence w.p.1.
Dominance over all measures is also satisfied by the restricted mixture $W$ over all
quasimeasures. We showed that $W$ converges to $\mu$ on all M.L.-random sequences by
exploiting the incomputable mixture $D$ over all measures. For $D \stackrel{M.L.}{\longrightarrow} \mu$ we achieved
a (weak) convergence rate; for $W \stackrel{M.L.}{\longrightarrow} D$ and $W/D \stackrel{M.L.}{\longrightarrow} 1$ only an asymptotic result.
The convergence rate properties w.p.1 of $D$ and $W$ are as excellent as for $M$.

We do not know whether $D/\mu \stackrel{M.L.}{\longrightarrow} 1$ holds. We also do not know the convergence
rate for $W \stackrel{M.L.}{\longrightarrow} D$, and the current bound for $D \stackrel{M.L.}{\longrightarrow} \mu$ is double exponentially
worse than for $M \stackrel{w.p.1}{\longrightarrow} \mu$. A minor question is whether $D$ is approximable (which is
unlikely). Finally there could still exist universal semimeasures $M$ (dominating all
enumerable semimeasures) for which M.L.-convergence holds ($\exists M : M \stackrel{M.L.}{\longrightarrow} \mu$?). In
case they exist, we expect them to have particularly interesting additional structure
and properties. While most results in algorithmic information theory are independent
of the choice of the underlying universal Turing machine (UTM) or universal
semimeasure (USM), there are also results which depend on this choice. For instance,
one can show that $\{(x,n) : K_U(x) \le n\}$ is tt-complete for some $U$, but not
tt-complete for others [MP02]. A potential $U$ dependence also occurs for predictions
based on monotone complexity [Hut03d]. It could lead to interesting insights
to identify a class of "natural" UTMs/USMs which have a variety of favorable properties.
A more moderate approach may be to consider classes $\mathcal{C}_i$ of UTMs/USMs
satisfying certain properties $\mathcal{P}_i$ and showing that the intersection $\bigcap_i\mathcal{C}_i$ is not empty.

Another interesting and potentially fruitful approach to the convergence problem
at hand is to consider other classes of semimeasures $\mathcal{M}$, define mixtures $M$ over $\mathcal{M}$,
and (possibly) generalized randomness concepts by using this $M$ in Definition 5.
Using this approach, in [Hut03b] it has been shown that convergence holds for a
subclass of Bernoulli distributions if the class is dense, but fails if the class is gappy,
showing that a denseness characterization of $\mathcal{M}$ could be promising in general.

Acknowledgements. We want to thank Alexey Chernov for his invaluable help.